CPAR: Classification based on Predictive Association Rules
Authors
Abstract
Recent studies in data mining have proposed a new classification approach, called associative classification, which, according to several reports, such as [7, 6], achieves higher classification accuracy than traditional classification approaches such as C4.5. However, the approach also suffers from two major deficiencies: (1) it generates a very large number of association rules, which leads to high processing overhead; and (2) its confidence-based rule evaluation measure may lead to overfitting. In comparison with associative classification, traditional rule-based classifiers, such as C4.5, FOIL and RIPPER, are substantially faster but their accuracy, in most cases, may not be as high. In this paper, we propose a new classification approach, CPAR (Classification based on Predictive Association Rules), which combines the advantages of both associative classification and traditional rule-based classification. Instead of generating a large number of candidate rules as in associative classification, CPAR adopts a greedy algorithm to generate rules directly from training data. Moreover, CPAR generates and tests more rules than traditional rule-based classifiers to avoid missing important rules. To avoid overfitting, CPAR uses expected accuracy to evaluate each rule and uses the best k rules in prediction.
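The last step of the abstract, scoring each rule by expected accuracy and predicting with the best k rules, can be illustrated with a minimal sketch. The abstract does not define the estimator, so the Laplace-smoothed estimate used here is an assumption, and the `Rule` class, the `predict` function, and the toy rules are hypothetical names and data introduced only for illustration. The sketch does not cover the greedy rule generation step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    body: frozenset   # attribute=value items that must all appear in an example
    label: str        # class predicted by the rule
    n_covered: int    # training examples matching the body
    n_correct: int    # of those, examples whose class equals `label`


def expected_accuracy(rule, n_classes):
    """Laplace-smoothed accuracy estimate (an assumed choice of estimator)."""
    return (rule.n_correct + 1) / (rule.n_covered + n_classes)


def predict(rules, example, n_classes, k=5):
    """Return the class whose best k matching rules have the highest mean estimate."""
    by_class = defaultdict(list)
    for rule in rules:
        if rule.body <= example:  # every item of the rule body is in the example
            by_class[rule.label].append(expected_accuracy(rule, n_classes))
    if not by_class:
        return None  # no rule matches; a real classifier would need a default class

    def score(label):
        top = sorted(by_class[label], reverse=True)[:k]
        return sum(top) / len(top)

    return max(by_class, key=score)


# Toy rules for a two-class problem (hypothetical data).
rules = [
    Rule(frozenset({"outlook=sunny"}), "no", n_covered=5, n_correct=4),
    Rule(frozenset({"humidity=high"}), "no", n_covered=10, n_correct=7),
    Rule(frozenset({"windy=false"}), "yes", n_covered=1, n_correct=1),
    Rule(frozenset({"outlook=sunny", "windy=false"}), "yes", n_covered=8, n_correct=6),
]
example = frozenset({"outlook=sunny", "humidity=high", "windy=false"})
prediction = predict(rules, example, n_classes=2, k=2)
```

In this toy data the rule `windy=false -> yes` has raw confidence 1.0 on a single training example, so ranking by confidence alone would favor "yes". Under the smoothed estimate it scores only 2/3, and averaging the two best rules per class selects "no", which is the kind of overfitting the abstract says expected accuracy is meant to reduce.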
Similar resources
A Study of Associative Classifiers with Different Rule Evaluation Measures for Tuberculosis Prediction
Tuberculosis (TB) is a disease caused by the bacterium Mycobacterium tuberculosis. It usually spreads through the air and attacks people with weakened immune systems, such as patients with Human Immunodeficiency Virus (HIV). Association Rule Mining (ARM) is one of the most popular approaches in data mining and, if used in the medical domain, has great potential to improve disease prediction. This results in lar...
FCP-Growth: Class Itemsets for Class Association Rules
Since the first work of Liu, Hsu, & Ma (1998), various works have shown the good performance of associative classification (association-based classification) in terms of error rate reduction. Associative classification deals with the prediction of the class from association rules, known as class association rules or predictive association rules. A class association rule is a rule whose consequent mus...
Fast rule-based bioactivity prediction using associative classification mining
Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining comm...
Review and Comparison of Associative Classification Data Mining Approaches
Associative classification (AC) is a data mining approach that combines association rule mining and classification to build classification models (classifiers). AC has attracted significant attention from researchers, mainly because it derives accurate classifiers that contain simple yet effective rules. In the last decade, a number of associative classification algorithms have been proposed ...
Publication date: 2003